Classification of MEDLINE Abstracts

Authors

  • Katsutoshi Ibushi
  • Nigel Collier
  • Jun’ichi Tsujii

Abstract

This paper presents preliminary results from our experiments on automatically assigning MeSH terms to MEDLINE abstracts. Every year about 100,000 documents are added to MEDLINE, and index terms from a controlled vocabulary called MeSH are assigned to each document by hand. This is necessarily time-consuming, and the large size of MeSH may lead to inconsistent indexing. Our purpose is to explore the feasibility of automating this indexing. To this end, we apply two document classification methods, based on SVM [1] and AdaBoost [4], which have shown good results in the classification of news corpora, and analyze their results. We assumed that a class consists of the abstracts that share the same MeSH term. Although MeSH terms have a hierarchical structure, each class was treated as independent. We used the MeSH terms previously assigned by specialists as the answer and compared them with the MeSH terms assigned by SVM and AdaBoost.
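The class definition and the comparison against specialist-assigned terms can be illustrated with a small sketch. It is not the authors' code: the abstract does not say which evaluation measures were used, so per-term precision and recall are an assumption here, and the function names, abstract ids, and toy MeSH terms are hypothetical.

```python
from collections import defaultdict


def build_classes(assignments):
    """Map each MeSH term to the set of abstract ids that carry it.

    Each term forms its own class; the MeSH hierarchy is ignored,
    so classes are independent of one another.
    """
    classes = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            classes[term].add(doc_id)
    return dict(classes)


def per_term_scores(gold, predicted):
    """Return {term: (precision, recall)} for each term seen in either input.

    Precision and recall are an assumed choice of measure, not one
    stated in the abstract.
    """
    gold_classes = build_classes(gold)
    pred_classes = build_classes(predicted)
    scores = {}
    for term in sorted(set(gold_classes) | set(pred_classes)):
        g = gold_classes.get(term, set())
        p = pred_classes.get(term, set())
        true_pos = len(g & p)
        precision = true_pos / len(p) if p else 0.0
        recall = true_pos / len(g) if g else 0.0
        scores[term] = (precision, recall)
    return scores


# Hypothetical toy data: terms assigned by specialists vs. by a classifier.
gold = {"a1": {"Humans", "Neoplasms"}, "a2": {"Humans"}, "a3": {"Mice"}}
predicted = {"a1": {"Humans"}, "a2": {"Humans", "Neoplasms"}, "a3": {"Mice"}}

scores = per_term_scores(gold, predicted)
for term, (precision, recall) in scores.items():
    print(f"{term}: precision={precision:.2f} recall={recall:.2f}")
```

In a setup like this, the output of any classifier (SVM, AdaBoost, or otherwise) would be passed in as `predicted`, with one independent binary decision per MeSH term.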

Similar resources

Classification of Clinically Useful Sentences in MEDLINE

OBJECTIVE In a previous study, we investigated a sentence classification model that uses semantic features to extract clinically useful sentences from UpToDate, a synthesized clinical evidence resource. In the present study, we assess the generalizability of the sentence classifier to Medline abstracts. METHODS We applied the classification model to an independent gold standard of high qualit...

Information Extraction and Sentence Classification applied to Clinical Trial MEDLINE Abstracts

In this paper, we first report experimental results on applying information extraction (IE) methodology to the task of summarizing clinical trial design information, focusing on “Compared Treatment”, “Endpoint” and “Patient Population”, from clinical trial MEDLINE abstracts. From these results, we have come to see this problem as one that can be decomposed into a sentence classification subtask...

Developing an Ontology for Encoding Disease Treatment Information in Medical Abstracts

A disease-treatment ontology is being developed to model and represent treatment information found in medical abstracts. Treatment information extracted from medical abstracts and medical articles can then be encoded in this ontology and used for information retrieval, question-answering, summarisation and knowledge discovery. This paper explains the initial version of the ontology developed ba...

Automatic Classification of PubMed Abstracts with Latent Semantic Indexing: Working Notes

The 2014 BioASQ challenge 2a tasks participants with assigning semantic tags to biomedical journal abstracts. We present a system that uses Latent Semantic Analysis to identify semantically similar documents in MEDLINE to an unlabeled abstract, and then uses a novel ranking scheme to select a list of MeSH headers from candidates drawn from the most similar documents. Our approach achieved good ...

Structured abstracts in MEDLINE, 1989-1991.

OBJECTIVE To characterize the structured abstracts in biomedical journals indexed in MEDLINE over a three-year period as an initial step in exploring their utility in enhancing bibliographic retrieval. DESIGN The study examined the occurrence of structured abstracts in MEDLINE from March 1989 to December 1991, characteristics of MEDLINE records for articles with structured abstracts, editoria...

Publication date: 1999